Method for efficient coding of shape descriptor parameters

ABSTRACT

A method of representing an object appearing in a still or video image, by processing signals corresponding to the image, comprises deriving a plurality of sets of co-ordinate values representing the shape of the object and quantising the co-ordinate values to derive a coded representation of the shape, and further comprises quantising a first co-ordinate value over a first quantisation range and quantising a smaller co-ordinate value over a smaller range.

[0001] The present invention relates to the representation of an object appearing, for example, in a still or video image, such as an image stored in a multimedia database, and particularly to the coding of such a representation.

[0002] In applications such as image or video libraries, it is desirable to have an efficient representation and storage of the outline or shape of objects or parts of objects appearing in still or video images. A known technique for shape-based indexing and retrieval uses Curvature Scale Space (CSS) representation. Details of the CSS representation can be found in the papers “Robust and Efficient Shape Indexing through Curvature Scale Space”, Proc. British Machine Vision Conference, pp. 53-62, Edinburgh, UK, 1996, and “Indexing an Image Database by Shape Content using Curvature Scale Space”, Proc. IEE Colloquium on Intelligent Databases, London, 1996, both by F. Mokhtarian, S. Abbasi and J. Kittler, the contents of which are incorporated herein by reference.

[0003] The CSS representation uses a curvature function for the outline of the object, starting from an arbitrary point on the outline. The curvature function is studied as the outline shape is evolved by a series of deformations which smooth the shape. More specifically, the zero crossings of the derivative of the curvature function convolved with a family of Gaussian filters are computed. The zero crossings are plotted on a graph, known as the Curvature Scale Space, where the x-axis is the normalised arc-length of the curve and the y-axis is the evolution parameter, specifically, the parameter of the filter applied. The plots on the graph form loops characteristic of the outline. Each convex or concave part of the object outline corresponds to a loop in the CSS image. The co-ordinates of the peaks of the most prominent loops in the CSS image are used as a representation of the outline.

[0004] To search for objects in images stored in a database matching the shape of an input object, the CSS representation of an input shape is calculated. The similarity between an input shape and stored shapes is determined by comparing the position and height of the peaks in the respective CSS images using a matching algorithm.

[0005] The number of bits required to express the properties of the contour shape in a descriptor should be as small as possible to facilitate efficient storage and transmission.

[0006] Aspects of the present invention are set out in the accompanying claims.

[0007] The invention can offer a very compact representation (in terms of number of bits used for storage) without any significant deterioration in retrieval performance.

[0008] Embodiments of the present invention will be described with reference to the accompanying drawings, of which:

[0009] FIG. 1 is a block diagram of a video database system;

[0010] FIG. 2 is a CSS representation of an outline;

[0011] FIG. 3 is a diagram illustrating coding of co-ordinate values of the CSS representation.

[0012] FIG. 1 shows a computerised video database system according to an embodiment of the invention. The system includes a control unit 2 in the form of a computer, a display unit 4 in the form of a monitor, a pointing device 6 in the form of a mouse, an image database 8 including stored still and video images and a descriptor database 10 storing descriptors of objects or parts of objects appearing in images stored in the image database 8.

[0013] A descriptor for the shape of each object of interest appearing in an image in the image database is derived by the control unit 2 and stored in the descriptor database 10. The control unit 2 derives the descriptors operating under the control of a suitable program implementing a method as described below.

[0014] Firstly, for a given object outline, a CSS representation of the outline is derived. This is done using the known method as described in one of the papers mentioned above.

[0015] More specifically, the outline is expressed by a representation Ψ = {(x(u), y(u)), u ∈ [0, 1]}, where u is a normalised arc length parameter and (x, y) are co-ordinates of the points on the object contour.

[0016] The outline is smoothed by convolving Ψ with a Gaussian kernel g(u, σ) or similar kernel, and the curvature zero crossings of the evolving curve are examined as σ changes. The zero crossings are identified using the following expression for the curvature:

$k(u,\sigma) = \frac{X_{u}(u,\sigma)\,Y_{uu}(u,\sigma) - X_{uu}(u,\sigma)\,Y_{u}(u,\sigma)}{\left(X_{u}(u,\sigma)^{2} + Y_{u}(u,\sigma)^{2}\right)^{3/2}}$

[0017] where

$X(u,\sigma) = x(u) * g(u,\sigma)$, $Y(u,\sigma) = y(u) * g(u,\sigma)$

[0018] and

$X_{u}(u,\sigma) = x(u) * g_{u}(u,\sigma)$, $X_{uu}(u,\sigma) = x(u) * g_{uu}(u,\sigma)$

[0019] In the above, * represents convolution and subscripts represent derivatives.
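The curvature expression above can be sketched numerically as follows. This is an illustrative sketch only, not the implementation used in the embodiment: it assumes the contour is given as N uniformly spaced samples of a closed curve, measures u and σ in units of samples rather than normalised arc length, truncates the kernels at about 4σ, and derives Y_u and Y_uu from y(u) in the same way as X_u and X_uu. The function names are hypothetical.

```python
import math


def gaussian_derivative_kernels(sigma, radius):
    """Sample g_u and g_uu (first and second Gaussian derivatives) at integer offsets."""
    g_u, g_uu = [], []
    for k in range(-radius, radius + 1):
        g = math.exp(-k * k / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))
        g_u.append(-k / sigma ** 2 * g)
        g_uu.append((k * k / sigma ** 4 - 1.0 / sigma ** 2) * g)
    return g_u, g_uu


def circular_convolve(signal, kernel):
    """Convolve a periodic signal (closed contour) with a kernel centred at offset 0."""
    n = len(signal)
    radius = len(kernel) // 2
    return [
        sum(signal[(i - k) % n] * kernel[k + radius] for k in range(-radius, radius + 1))
        for i in range(n)
    ]


def curvature(xs, ys, sigma):
    """Evaluate k(u, sigma) at every contour sample."""
    radius = int(4 * sigma) + 1  # assumed truncation of the kernels
    g_u, g_uu = gaussian_derivative_kernels(sigma, radius)
    x_u, y_u = circular_convolve(xs, g_u), circular_convolve(ys, g_u)
    x_uu, y_uu = circular_convolve(xs, g_uu), circular_convolve(ys, g_uu)
    return [
        (xu * yuu - xuu * yu) / (xu * xu + yu * yu) ** 1.5
        for xu, yu, xuu, yuu in zip(x_u, y_u, x_uu, y_uu)
    ]


# Usage: a circle of radius 10 traversed anticlockwise has curvature close to 1/10.
n = 200
circle_x = [10.0 * math.cos(2.0 * math.pi * i / n) for i in range(n)]
circle_y = [10.0 * math.sin(2.0 * math.pi * i / n) for i in range(n)]
kappa = curvature(circle_x, circle_y, sigma=3.0)
```

Because the expression is a ratio of derivatives, it does not depend on how u is scaled, which is why sample units can stand in for normalised arc length in this sketch.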

[0020] The number of curvature zero crossings changes as σ changes, and when σ is sufficiently high Ψ is a convex curve with no zero crossings.
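As a small illustration of the statement above, the zero crossings of a sampled curvature function can be counted as sign changes around the closed contour. The helper below is a hypothetical sketch that takes a list of curvature samples as input and ignores samples that are exactly zero; a convex curve, whose curvature has a single sign, yields a count of zero.

```python
def count_zero_crossings(kappa):
    """Count sign changes of curvature samples taken around a closed contour."""
    # Exact zeros carry no sign, so they are skipped (an assumption of this sketch).
    signs = [1 if k > 0 else -1 for k in kappa if k != 0]
    if not signs:
        return 0
    # signs[i - 1] wraps to the last sample when i == 0, closing the contour.
    return sum(1 for i in range(len(signs)) if signs[i] != signs[i - 1])
```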

[0021] The zero crossing points (u, σ) are plotted on a graph, known as the CSS image space. This results in a plurality of curves characteristic of the original outline. The peaks of the characteristic curves are identified and the corresponding co-ordinates are extracted and stored. In general terms, this gives a set of n co-ordinate pairs [(x1, y1), (x2, y2), ..., (xn, yn)], where n is the number of peaks, xi is the arc-length position of the ith peak and yi is the peak height.

[0022] The order and position of characteristic curves and the corresponding peaks as they appear in the CSS image space depend on the starting point for the curvature function described above. The peak co-ordinates are re-ordered, as described below.

[0023] Let us assume that the contour from which parameters are extracted has n peaks, with the peak parameters forming a set {(x¹, y¹), (x², y²), ..., (xⁿ, yⁿ)} as shown in FIG. 2. The peaks are then ordered based on height (that is, the y co-ordinate value) in either increasing or decreasing order {(x₁, y₁), (x₂, y₂), ..., (xₙ, yₙ)} (subscripts denote the peak order number after ordering). Let us assume that peaks are ordered in decreasing order, so that the highest peak is the first one (x₁, y₁), and each of the subsequent peaks is lower than or equal to its predecessor in the set (FIG. 3).
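The re-ordering step can be sketched as below, assuming each peak is held as an (x, y) pair with y the peak height. The function name and the sample values are hypothetical.

```python
def order_peaks(peaks):
    """Return the (x, y) peaks sorted by decreasing height y."""
    # sorted() is stable, so peaks of equal height keep their original relative order.
    return sorted(peaks, key=lambda peak: peak[1], reverse=True)


# Usage with made-up peak values.
ordered = order_peaks([(0.70, 100.0), (0.10, 893.0), (0.35, 400.0)])
```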

[0024] These re-ordered peak co-ordinates form the basis of the descriptor for the object outline. Additional parameters of the shape, such as circularity C, eccentricity E and compactness D, some extracted from the so-called “prototype contour shape”, can also be computed and stored to be used in the matching process as described in co-pending application no. GB 9916684.5, the contents of which are incorporated herein by reference.

[0025] Next, coarse quantisation of the peak heights is performed. The range over which quantisation is performed is different for each peak, and depends on the values of the higher peaks (e.g. heights of the peaks which are predecessors in the ordered set).

[0026] Referring to FIG. 3, the first peak is quantised over a range I1 = [0, Y_max], where Y_max is the maximum value for the peak that is expected on a certain class of shapes. Each of the remaining peaks is quantised over a range which depends on the value of one or several of the previous peaks. For example, peak y₂ is quantised over the interval I2 = [0, y₁] (FIG. 3), peak y₃ over the interval [0, y₂], etc.

[0027] In this embodiment, the first peak is quantised over the interval [0, 1024] using 7 bits and the remaining peaks are quantised to 3 bits over the appropriate respective range, as discussed above. Supposing the height of the first peak is 893, say, then y₂ is quantised over the range [0, 893], using 3 bits, and so on. Accordingly, for peaks y₂ to y₅, the quantisation interval is reduced, giving greater accuracy despite using fewer bits. The x position of each peak is quantised to 6 bits uniformly distributed on the [0, 1) interval. The x value may be the original x value, as shown, for example, in FIG. 2, or the value after shifting along the x axis by an amount such that the x value for the highest peak is at 0.
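The coding of this embodiment can be sketched as follows. The text does not specify the form of the quantiser, so a plain uniform quantiser that maps a value to the index of its cell is assumed here; the function names, default arguments and sample peaks are hypothetical. Each height after the first is quantised over [0, height of the previous peak], and each x value over [0, 1).

```python
def quantise(value, upper, bits):
    """Uniform quantiser over [0, upper]; returns an index in [0, 2**bits - 1]."""
    levels = 1 << bits
    if upper <= 0:
        return 0
    index = int(value / upper * levels)
    return max(0, min(levels - 1, index))  # clamp value == upper into the top cell


def encode_peaks(peaks, y_max=1024.0, first_bits=7, other_bits=3, x_bits=6):
    """Code (x, y) peaks that are already sorted by decreasing height y."""
    codes = []
    upper = y_max  # range for the highest peak
    for i, (x, y) in enumerate(peaks):
        bits = first_bits if i == 0 else other_bits
        codes.append((quantise(x, 1.0, x_bits), quantise(y, upper, bits)))
        upper = y  # the next, lower peak is quantised over [0, y]
    return codes


# Usage: the first height 893 narrows the range of the second peak to [0, 893].
codes = encode_peaks([(0.0, 893.0), (0.5, 400.0), (0.25, 100.0)])
# codes == [(0, 111), (32, 3), (16, 2)]
```

With 3 bits over [0, 893] the cell width is about 112, whereas 3 bits over the full [0, 1024] range would give cells of width 128, which illustrates how the narrowing range recovers some accuracy.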

[0028] Let us examine the gain from the presented invention. In the conventional solution each peak requires two floating point numbers, 4 bytes each. Thus, for a typical shape with 9 peaks, the storage requirement is 9*2*4 = 72 bytes (576 bits). After application of the proposed embodiment, the first peak requires 7 bits, assuming that the x value is treated as zero, and each consecutive peak 6+3 bits, thus 7 + 8*9 = 79 bits in total.
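The storage comparison above can be reproduced with a few lines of arithmetic. The function names are hypothetical; the bit widths are those of the described embodiment, with the x value of the highest peak treated as zero and therefore not stored.

```python
def conventional_bits(n_peaks, bytes_per_float=4):
    """Two floating point numbers per peak."""
    return n_peaks * 2 * bytes_per_float * 8


def coded_bits(n_peaks, first_y_bits=7, x_bits=6, y_bits=3):
    """Highest peak: height only. Every other peak: x position plus height."""
    if n_peaks == 0:
        return 0
    return first_y_bits + (n_peaks - 1) * (x_bits + y_bits)


# Usage for the 9-peak example in the text.
before, after = conventional_bits(9), coded_bits(9)  # 576 and 79 bits
```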

[0029] Instead of a range [0, y_i], a range [0, R(y_i)) could be used, where R(y_i) is the reconstruction of the value y_i after inverse quantisation.
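A practical motivation for this variant, as far as can be inferred from the text, is that a decoder only has the quantised indices, so a range built from the reconstructed value R(y_i) can be recomputed identically on both sides. The sketch below assumes a uniform quantiser with mid-cell reconstruction, which the text does not specify; all names and values are hypothetical.

```python
def quantise(value, upper, bits):
    """Uniform quantiser over [0, upper]; returns an index in [0, 2**bits - 1]."""
    levels = 1 << bits
    if upper <= 0:
        return 0
    # A height above the reconstructed range is clamped into the top cell.
    return max(0, min(levels - 1, int(value / upper * levels)))


def reconstruct(index, upper, bits):
    """Inverse quantisation R(.): the mid-point of the cell selected by index."""
    return (index + 0.5) * upper / (1 << bits)


def encode_heights(heights, y_max=1024.0, first_bits=7, other_bits=3):
    """Quantise heights sorted in decreasing order, each over [0, R(previous)]."""
    indices, upper = [], y_max
    for i, y in enumerate(heights):
        bits = first_bits if i == 0 else other_bits
        index = quantise(y, upper, bits)
        indices.append(index)
        upper = reconstruct(index, upper, bits)  # range for the next peak
    return indices


def decode_heights(indices, y_max=1024.0, first_bits=7, other_bits=3):
    """Rebuild the heights, recomputing the same ranges from the indices alone."""
    heights, upper = [], y_max
    for i, index in enumerate(indices):
        bits = first_bits if i == 0 else other_bits
        upper = reconstruct(index, upper, bits)
        heights.append(upper)
    return heights


# Usage with made-up heights.
indices = encode_heights([893.0, 400.0, 100.0])   # [111, 3, 2]
decoded = decode_heights(indices)                 # [892.0, 390.25, 121.953125]
```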

[0030] An alternative embodiment which will have a similar effect is to divide the height of each of the peaks {y2, y3, ..., yn} (except the highest one) by the value of the respective previous peak. After this operation, all yi lie in the range (0, 1]. This allows the use of much coarser quantisation for all yi.
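The normalisation in this alternative can be sketched as below, assuming strictly positive heights sorted in decreasing order. The function name and the sample values are hypothetical.

```python
def height_ratios(heights):
    """Divide each height (except the highest) by the height of the previous peak."""
    return [heights[i] / heights[i - 1] for i in range(1, len(heights))]


# Usage: every ratio lies in (0, 1] and can be quantised over one fixed range.
ratios = height_ratios([893.0, 400.0, 100.0])  # second ratio is exactly 0.25
```

Under a uniform quantiser, quantising the ratio y_i / y_(i-1) over [0, 1] selects the same cell as quantising y_i over [0, y_(i-1)], which is one way to see why the two embodiments have a similar effect.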

[0031] In either example, good results can be obtained by using 7 or 6 bit quantisation for the highest peak and 4 or 3 bit quantisation for all the remaining peaks. Other numbers of bits can be used.

[0032] The above operations can also be performed after the co-ordinate values have been subjected to a binormal filtering and a non-linear transformation, as described in co-pending application no. GB 9915699.4, the contents of which are incorporated herein by reference. The x co-ordinates can be coded along the lines described above instead of or as well as the y values.

[0033] The resulting values can be stored for use, for example, in a suitable matching procedure, such as described in our co-pending applications GB 9915699.4, GB 9915698.6 and GB 9916684.5, with appropriate modifications, for example, by performing inverse quantisation on the stored descriptors before performing matching.

1. A method of representing an object appearing in a still or video image, by processing signals corresponding to the image, the method comprising deriving a plurality of sets of co-ordinate values representing the shape of the object and quantising the co-ordinate values to derive a coded representation of the shape, wherein, for a first set of co-ordinate values having a value for a given co-ordinate less than the corresponding co-ordinate value in a second set of co-ordinate values, the range over which the given co-ordinate value is quantised is less for the first set of co-ordinate values than for the second set of co-ordinate values.
2. A method as claimed in claim 1 wherein the quantisation range for the given co-ordinate value of the first set of co-ordinate values is based at least on the corresponding co-ordinate value of the second set of co-ordinate values.
3. A method as claimed in claim 2 wherein, for a sequence of decreasing co-ordinate values, the quantisation range for each co-ordinate value is based on one or more of the preceding, higher, co-ordinate values, where they exist.
4. A method as claimed in any one of claims 1 to 3 wherein the number of bits allocated to the quantised representation of the given co-ordinate value for the first set of values is less than the number of bits allocated to the quantised representation of the corresponding co-ordinate value for the second set of values.
5. A method as claimed in any one of claims 1 to 3 wherein the number of bits allocated to the quantised representation of the given co-ordinate value for the first set of values is the same as the number of bits allocated to the quantised representation of the corresponding co-ordinate value for the second set of values.
6. A method as claimed in any preceding claim wherein the sets of co-ordinate values are co-ordinate pairs, and the quantisation range varies for at least one co-ordinate value of each pair of co-ordinate values.
7. A method as claimed in claim 6 wherein the co-ordinate pairs correspond to the positions of the peaks in a CSS representation of a shape.
8. A method as claimed in claim 7 wherein the varying quantisation range is used for the co-ordinate value corresponding to the peak height.
9. A method as claimed in any one of the preceding claims wherein the quantisation range is the same for a plurality of co-ordinate values.
10. A method as claimed in any preceding claim comprising the step of ordering co-ordinate values in decreasing or increasing size.
11. A method of representing an object appearing in a still or video image, by processing signals corresponding to the image, the method comprising deriving a plurality of sets of co-ordinate values representing the shape of the object and quantising the co-ordinate values to derive a coded representation of the shape, the method further comprising quantising a first co-ordinate value over a first quantisation range and quantising a smaller co-ordinate value over a smaller range.
12. A method as claimed in claim 11 wherein at least one co-ordinate value is quantised using fewer bits than a higher co-ordinate value.
13. A method as claimed in claim 11 or claim 12 wherein at least one co-ordinate value is quantised using the same number of bits as a higher co-ordinate value.
14. A method as claimed in any one of claims 11 to 13 wherein the quantisation range for a co-ordinate value is dependent on at least a higher co-ordinate value.
15. A method of searching for an object in a still or video image by processing signals corresponding to images, the method comprising inputting a query object, deriving a representation of the query object, comparing the representation with representations derived using a method as claimed in any preceding claim, and selecting and displaying those objects for which the representations indicate a degree of similarity to the query.
16. An apparatus adapted to implement a method as claimed in any preceding claim.
17. A computer program for implementing a method as claimed in any one of claims 1 to 15.
18. A computer system programmed to operate according to a method as claimed in any one of claims 1 to 15.
19. A computer-readable storage medium storing computer-executable process steps for implementing a method as claimed in any one of claims 1 to 15.
20. A method of representing an object of an outline substantially as hereinbefore described as an embodiment with reference to the accompanying drawings.